Unraveling complex temporal associations in cellular systems across multiple time-series microarray datasets
Authors
Abstract
Unraveling the temporal complexity of cellular systems is a challenging task, as the subtle coordination of molecular activities cannot be adequately captured by simple mathematical concepts such as correlation. This paper addresses the challenge with a data-mining approach. We introduce the novel concept of a "frequent temporal association pattern" (FTAP): a set of genes that simultaneously exhibit complex temporal expression patterns recurrently across multiple microarray datasets. Such temporal signals are hard to identify in individual microarray datasets, but become significant through their frequent occurrence across multiple datasets. We designed an efficient two-stage algorithm to identify FTAPs. First, for each gene, we identify expression trends that occur frequently across multiple datasets. Second, we look for sets of genes that simultaneously exhibit their respective trends recurrently in multiple datasets. We applied this algorithm to 18 yeast time-series microarray datasets. The majority of FTAPs identified by the algorithm are associated with specific biological functions. Moreover, a significant number of patterns include genes that are functionally related but do not exhibit co-expression; such gene groups cannot be captured by clustering algorithms. Our approach offers three advantages: (1) it can identify complex associations of temporal trends in gene expression, an important step towards understanding the complex mechanisms governing cellular systems; (2) it is capable of integrating time-series data with different time scales and intervals; and (3) it yields results that are robust against outliers.
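The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how a trend is encoded or how simultaneity is judged, so everything below is an assumption made for illustration. Here a trend is taken to be a fixed-width string of up/down/flat moves (U/D/F), two trends are called simultaneous when they start at the same time index in the same dataset, and the names `discretize`, `frequent_trends`, `frequent_associations`, `eps`, `width`, and `min_support` are all hypothetical. The toy data is invented purely to exercise the code.

```python
from collections import defaultdict
from itertools import combinations


def discretize(series, eps=0.2):
    """Encode a numeric series as U/D/F moves between consecutive points (assumed encoding)."""
    moves = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        moves.append("U" if diff > eps else "D" if diff < -eps else "F")
    return "".join(moves)


def frequent_trends(datasets, width=2, min_support=2):
    """Stage 1: keep each (gene, trend) seen in at least min_support datasets.

    Returns {(gene, trend): {(dataset_index, start_index), ...}}.
    """
    occurrences = defaultdict(set)
    for d_idx, data in enumerate(datasets):
        for gene, series in data.items():
            moves = discretize(series)
            for start in range(len(moves) - width + 1):
                occurrences[(gene, moves[start:start + width])].add((d_idx, start))
    return {
        key: where
        for key, where in occurrences.items()
        if len({d for d, _ in where}) >= min_support
    }


def frequent_associations(items, size=2, min_support=2):
    """Stage 2: gene sets whose trends share a window in at least min_support datasets."""
    found = {}
    for combo in combinations(sorted(items), size):
        if len({gene for gene, _ in combo}) < size:
            continue  # skip combinations that reuse the same gene
        shared = set.intersection(*(items[key] for key in combo))
        support = sorted({d for d, _ in shared})
        if len(support) >= min_support:
            found[combo] = support
    return found


# Invented toy data: three "datasets", each mapping gene name -> expression series.
datasets = [
    {"A": [0, 1, 2, 2], "B": [2, 1, 0, 0], "C": [0, 0, 0, 0]},
    {"A": [1, 2, 3, 3], "B": [5, 4, 3, 3], "C": [0, 1, 1, 0]},
    {"A": [2, 1, 0, 0], "B": [0, 0, 1, 0], "C": [0, 1, 0, 1]},
]

trends = frequent_trends(datasets)
patterns = frequent_associations(trends)
for combo, support in patterns.items():
    print(combo, "recurs in datasets", support)
```

In the toy data, gene A rises while gene B falls in the first two datasets, so the pair is reported as an association even though the two genes are not co-expressed, which mirrors the abstract's remark about gene groups that clustering would miss. The exhaustive `combinations` scan is only workable for tiny inputs; an efficient search, as the abstract claims for the real method, would need pruning that this sketch does not attempt.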
Similar resources
MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS
This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Learning Multiple Temporal Matching for Time Series Classification
In real applications, time series are generally of complex structure, exhibiting different global behaviors within classes. To discriminate such challenging time series, we propose a multiple temporal matching approach that reveals the commonly shared features within classes, and the most differential ones across classes. For this, we rely on a new framework based on the variance/covariance cri...
Schemas of Clustering
Data mining techniques, such as clustering, have become a mainstay in many applications such as bioinformatics, geographic information systems, and marketing. Over the last decade, due to new demands posed by these applications, clustering techniques have been significantly adapted and extended. One such extension is the idea of finding clusters in a dataset that preserve information about some...
Multiple gene expression profile alignment for microarray time-series data clustering
MOTIVATION: Clustering gene expression data given in terms of time-series is a challenging problem that imposes its own particular constraints. Traditional clustering methods based on conventional similarity measures are not always suitable for clustering time-series data. A few methods have been proposed recently for clustering microarray time-series, which take the temporal dimension of the da...
Journal title:
- Journal of Biomedical Informatics
Volume 43, Issue 4
Pages: -
Publication date: 2010